Text data augmentations: Permutation, antonyms and negation

Abstract

Text has traditionally been used to train automated classifiers for a multitude of purposes, such as classification, topic modelling and sentiment analysis. State-of-the-art LSTM classifiers require a large number of training examples to avoid biases and to generalise successfully. Labelled data greatly improves classification results, but not all modern datasets include large numbers of labelled examples. Labelling is a complex task that can be expensive and time-consuming, and it potentially introduces biases. Data augmentation methods create synthetic data based on existing labelled examples, with the goal of improving classification results. These methods have been used in image classification tasks, and recent research has extended them to text classification. We propose a method that uses sentence permutations to augment an initial dataset, while retaining key statistical properties of the dataset. We evaluate our method with eight different datasets and a baseline Deep Learning process. This permutation method significantly improves classification accuracy, by an average of 4.1%. We also propose two more text augmentations that reverse the classification of each augmented example: antonym and negation. We test these two augmentations on three eligible datasets, and the results suggest an improvement in classification accuracy, averaged across datasets, of 0.35% for antonym and 0.4% for negation, when compared to our proposed permutation augmentation.
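The sentence-permutation idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the procedure, so the naive sentence splitter, the function names `split_sentences` and `permute_example`, and the `max_new` and `seed` parameters are all assumptions made for illustration. The sketch reorders the sentences of a labelled example and keeps the label unchanged, so each synthetic example contains exactly the same words as the original.

```python
import random
import re


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def permute_example(text, label, max_new=3, seed=0):
    """Return up to `max_new` distinct (text, label) pairs with reordered sentences.

    The label is copied unchanged. Every variant holds the same sentences as
    the original, so word counts are preserved; only sentence order differs.
    """
    sentences = split_sentences(text)
    rng = random.Random(seed)
    seen = {tuple(sentences)}  # never emit the original order or a duplicate
    augmented = []
    for _ in range(20 * max_new):  # bounded number of shuffle attempts
        if len(augmented) == max_new:
            break
        order = sentences[:]
        rng.shuffle(order)
        if tuple(order) not in seen:
            seen.add(tuple(order))
            augmented.append((" ".join(order), label))
    return augmented


if __name__ == "__main__":
    review = "I liked the plot. The acting was weak. I would watch it again."
    for new_text, new_label in permute_example(review, "positive"):
        print(new_label, "|", new_text)
```

An example with a single sentence has no alternative ordering, so the function returns an empty list for it. The antonym and negation augmentations are not sketched here, since they additionally reverse the label and depend on lexical resources the abstract does not specify.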

Similar resources

Negation, Contrast and Contradiction in Text Processing

This paper describes a framework for recognizing contradictions between multiple text sources by relying on three forms of linguistic information: (a) negation; (b) antonymy; and (c) semantic and pragmatic information associated with the discourse relations. Two views of contradictions are considered, in which a novel method of recognizing contrast and of finding antonymies are described. Contr...

Decomposing Antonyms?

Are the marked members of antonym-pairs such as long – short decomposed in the syntax? Büring has recently argued that they are, on the basis of evidence about the distribution of Rullmann-ambiguities and crosspolar anomalies. But the readings of marked antonyms in the complements of matrix modals seem to argue for the opposite conclusion. The dilemma that results defies a simple solution. Perh...

Negation Detection in Swedish Clinical Text

NegEx, a rule-based algorithm that detects negations in English clinical text, was translated into Swedish and evaluated on clinical text written in Swedish. The NegEx algorithm detects negations through the use of trigger phrases, which indicate that a preceding or following concept is negated. A list of English trigger phrases was translated into Swedish, taking grammatical differences betwee...

Some Issues on Detecting Negation from Text

Negation is present in all human languages and it is used to reverse the polarity of parts of a statement. It is a complex phenomenon that interacts with many other aspects of language. Besides the direct meaning, negated statements often carry a latent positive meaning. Negation can be interpreted in terms of its scope and focus. This paper explores the importance of both scope and focus to ca...

Rule Learning with Negation for Text Classification

Classification rule generators that have the potential to include negated features in their antecedents are generally acknowledged to generate rules that have greater discriminating power than rules without negation. This can be achieved by including the negation of all features as part of the input. However, this is only viable if the number of features is relatively small. There are many appl...

Journal

Journal title: Expert Systems With Applications

Year: 2021

ISSN: 1873-6793, 0957-4174

DOI: https://doi.org/10.1016/j.eswa.2021.114769